Structured Data on the Web for Physical Samples

Version June 2019

Doug Fils

Outline

  • Testing structured data for the web
  • Leveraging work in Project 418 / 419
  • An exercise in scaling
  • Cost benefit analysis

Related work of potential interest: RDA, other EarthCube (EC) projects such as Linked.Earth (LiPD), THROUGHPUT, and others.

IGSN e.V. and Structured Data on the Web

  • Leverage the growing momentum behind structured data on the web for samples
  • EarthCube Project 418 / 419 as a learning experience
  • An exercise mostly around scaling
  • Benefit argument
    • Easy to integrate into other graphs from this pattern
    • Leverage the web architecture; its scale and redundancy

Project 418 / 419 Gleaner

An experiment in code

  • Test build with 3 million samples
  • How to better address scale?
    • Exploit the sitemap <lastmod> date element
    • Indexers will need to develop implementation patterns for leveraging <lastmod>
    • Explore HTTP/2 and content negotiation to see their impact
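
One way an indexer might use <lastmod> is sketched below: parse a sitemap and keep only the URLs modified after the previous harvest. This is a minimal illustration, not Gleaner's implementation. The function names, the example.org URLs, and the choice to re-fetch entries that lack <lastmod> are all assumptions.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def parse_lastmod(text):
    """Parse a sitemap date (date-only or full timestamp) into an aware datetime."""
    text = text.strip()
    if text.endswith("Z"):
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: treat dates without a zone as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def changed_since(sitemap_xml, last_harvest):
    """Return the <loc> values that changed after last_harvest.

    Entries without <lastmod> are returned too, since their state is unknown.
    """
    root = ET.fromstring(sitemap_xml)
    changed = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if loc is None:
            continue
        if lastmod is None or parse_lastmod(lastmod) > last_harvest:
            changed.append(loc.strip())
    return changed


# Hypothetical sitemap with three sample landing pages.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/sample/A1</loc><lastmod>2019-05-01</lastmod></url>
  <url><loc>https://example.org/sample/A2</loc><lastmod>2019-06-10T12:00:00Z</lastmod></url>
  <url><loc>https://example.org/sample/A3</loc></url>
</urlset>"""

cutoff = datetime(2019, 6, 1, tzinfo=timezone.utc)
print(changed_since(SAMPLE, cutoff))
```

With a harvest cutoff of 1 June 2019, only A2 (modified later) and A3 (no date given) would be fetched again, which is the saving that matters at millions of samples.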

An experiment in vocabularies

  • IGSN needs a vocabulary to support this; build on the XML schema work to date
  • Leverage existing vocabularies such as:
    • SOSA
    • DCAT
    • PROV
    • schema.org
    • more...
  • This helps bring FAIR principles to samples
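
To make the idea concrete, the sketch below builds a JSON-LD description of a sample from schema.org terms plus a SOSA type, and wraps it in the script tag used to embed structured data in a landing page. It is only an illustration of the pattern, not the IGSN vocabulary, which the slide notes is still to be developed. The function names, the IGSN value, the example.org URL, and the choice of properties are all assumptions.

```python
import json


def sample_to_jsonld(igsn, name, description, landing_page):
    """Describe a physical sample as JSON-LD (hypothetical property choices)."""
    return {
        "@context": "https://schema.org/",
        "@type": "Thing",
        # Assumption: borrow the SOSA Sample class as an additional type.
        "additionalType": "http://www.w3.org/ns/sosa/Sample",
        "@id": landing_page,
        "name": name,
        "description": description,
        "url": landing_page,
        "identifier": {
            "@type": "PropertyValue",
            "propertyID": "IGSN",
            "value": igsn,
        },
    }


def embed_in_html(doc):
    """Wrap a JSON-LD document in the script tag used on landing pages."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical sample record; the identifier and URL are placeholders.
doc = sample_to_jsonld(
    igsn="XXX0000001",
    name="Example core section",
    description="Placeholder description of a sediment core section.",
    landing_page="https://example.org/sample/XXX0000001",
)
print(embed_in_html(doc))
```

Because the output is plain JSON-LD, an indexer can load it into a graph alongside records that use DCAT or PROV terms without a sample-specific parser.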

Other vocabulary development

  • schema.org plus extensions
  • semantic interoperability: collaborations with ESIP, CSIRO, schema.org (DCAT), and others
  • SHACL repo

Reason for this test

  • To see whether the approach used for data sets (hundreds of thousands) also works for samples (hundreds of millions)
  • Test of implementations to help discover red flags
    • sitemap scaling (50K sitemaps per index × 50K URLs per sitemap = 2.5B URLs)
    • server scaling
    • caching
    • more...
  • Benefits the whole ecosystem of structured data on the web for geosciences
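
The sitemap arithmetic can be checked with a short calculation. The sitemaps.org protocol allows up to 50,000 URLs per sitemap file and up to 50,000 sitemaps per sitemap index. The sketch below works out how many files a collection would need. The function name is hypothetical, and the 100 million figure is only an illustrative stand-in for "hundreds of millions".

```python
import math

# Limits from the sitemaps.org protocol.
MAX_URLS_PER_SITEMAP = 50_000
MAX_SITEMAPS_PER_INDEX = 50_000


def sitemap_plan(total_urls):
    """Return (sitemap files, sitemap index files) needed for total_urls."""
    sitemaps = math.ceil(total_urls / MAX_URLS_PER_SITEMAP)
    indexes = math.ceil(sitemaps / MAX_SITEMAPS_PER_INDEX)
    return sitemaps, indexes


# Capacity of a single sitemap index: 50,000 x 50,000 URLs.
print(MAX_URLS_PER_SITEMAP * MAX_SITEMAPS_PER_INDEX)  # 2500000000

print(sitemap_plan(3_000_000))    # (60, 1)   the 3 million sample test build
print(sitemap_plan(100_000_000))  # (2000, 1) an illustrative 100 million samples
```

On these numbers a single index covers the sample counts discussed here, so the file limits themselves are not the obstacle. The open questions are the cost of generating, serving, and re-crawling thousands of sitemap files.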

Thank you
